Data biography: Ku Klux Klan (KKK) Ledgers in the Greater Denver area¶
The Ku Klux Klan (KKK) is one of the most notorious white supremacist organizations in the history of the United States. It was first founded in 1865, after the end of the Civil War, to defend the interests of white people and to oppose the freedom and civil rights of Black people and immigrants. The group reached the height of its expansion in the 1920s, when its members infiltrated local governments, law enforcement, and other organizations.
Colorado was one of the areas where the Klan was active in the 1920s, and the subject of this data biography is the Ku Klux Klan Ledgers from Greater Denver (1924-1926). This historical archive is publicly available from History Colorado. The dataset contains detailed information on Denver-area KKK members and their associates during 1924-1926, including names, addresses, and telephone numbers.
Who?¶
These ledgers were originally compiled by the Ku Klux Klan in the 1920s to manage its membership and organize its activities in the Denver area. The Klan was powerful in Colorado at the time, and its record keepers entered member details such as names, addresses, and contact information into the ledgers by hand for internal administrative use.
The physical ledgers were donated anonymously to History Colorado in 1946 through a staff member of the Rocky Mountain News. History Colorado has since preserved and digitized them, and it makes both the ledgers and their digital form available for the public to view and download on its website.
When & Where?¶
The information was recorded between 1924 and 1926 by the Ku Klux Klan in and around the Greater Denver area. The ledger records reflect how far the organization reached into local society and how it managed its membership. In 2021, History Colorado digitized the ledgers and made them available to the public and researchers as part of its historical archives.
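import pandas as pd
import plotly.express as px

# Load the index CSV published by History Colorado
kkk_df = pd.read_csv('kkk-ledgers-index.csv', low_memory=False)

# Count people per city of residence and keep the 11 largest cities, then drop
# the first row (the largest, assumed to be Denver) to leave 10 other cities
city_counts = (
    kkk_df['residenceCity'].value_counts().nlargest(11)
    .rename_axis('residenceCity').reset_index(name='count')
)
city_count = city_counts.tail(10)

fig = px.bar(
    city_count,
    x='residenceCity',
    y='count',
    title='Number of persons residing outside Denver',
    labels={'residenceCity': 'City', 'count': 'Count'}
)
fig.show()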
# Show every column, then draw 20 random people who have a recorded business address
pd.set_option('display.max_columns', None)
sample_df = kkk_df[['fullName', 'Business Address']].dropna().sample(20)
sample_df
| | fullName | Business Address |
|---|---|---|
| 5278 | Bruce H Foster | 23rd & Blake |
| 1116 | Wm H Hedley, Jr. | 1636 Champa |
| 10816 | Allison R Beckmann | 124 W 14th Ave |
| 2101 | Chas H Burnham | 1811 Glenarm St. |
| 8929 | Clarence Edwin Fraser | 1200 Bannock |
| 3033 | Lyle Abram Holland | 2737 W 27th Ave |
| 8355 | Jack C Dister | 1408 Curtis |
| 5738 | Darrell W Meacham | 2900 Welton |
| 16250 | Wm W Fritts | 1119 18th St |
| 7543 | Fred'k C Butterfield | 1632 Court Pl |
| 3500 | Valentine Emil Joerger | 1916 Blake |
| 12183 | Wm A Trowbridge | 515 Kittredge Bldg, 16th & Glenarm* |
| 14040 | Frank Henson | 3825 Tennyson |
| 6200 | M J Roberts | 1541 Lincoln |
| 8090 | Roy S Walters | 547 Galapago |
| 14310 | Andrew C Campbell | 2424 East Colfax |
| 4905 | Theodore F Clark, Jr. | 736 14th St |
| 9005 | Lucion Grady Hubbard | 516 Foster Bldg, 16th & Champa* |
| 6915 | Henry M Duff | 1736 Platte St |
| 9927 | John Henry Long | 17th & Broadway* |
The chart and table above give a sense of the KKK's reach at the time. Within Denver, members listed business addresses across the city, which suggests a presence in many organizations and institutions, and membership also extended to cities beyond Denver.
How?¶
The ledgers were originally written by hand and record several kinds of personal information. Because they were intended primarily for internal management, the entries are highly detailed.
The post-processing involved scanning the ledgers into PDF images, applying OCR for text recognition, and then manually reviewing the results and converting them to CSV format for data analysis. History Colorado provides both the PDF images and the CSV files.
# Count non-null values per column; tail(23) keeps the 23 least-filled columns
non_counts = kkk_df.notnull().sum().sort_values(ascending=False).tail(23).reset_index()
non_counts.columns = ['Column', 'NonNullCount']
fig = px.bar(
    non_counts,
    x='Column',
    y='NonNullCount',
    title='Non-null values per column',
    labels={'Column': '', 'NonNullCount': 'Count'},
    color='NonNullCount',
    color_continuous_scale='Inferno_r',
    template='plotly_white'
)
fig.show()
We can see that, apart from names, the number of records for other member-related fields drops by almost half. This suggests that the KKK prioritized names when keeping the ledger and treated addresses and other details as secondary.
It may also be that some people were reluctant to give out their personal information; after all, they were joining an unofficial organization. Either way, it suggests that the Ku Klux Klan did not need much information to keep track of its members, and that a name alone may have been enough to be entered in the ledger.
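To put a number on how sparsely each field is filled, the non-null counts can be converted to percentages. The following is a minimal sketch, not part of the original analysis: `fill_rates` is a hypothetical helper, and the toy frame uses made-up values and assumed column names (`residenceAddress`, `phone`) only to show the shape of the result.

```python
import pandas as pd


def fill_rates(df: pd.DataFrame) -> pd.Series:
    """Return the percentage of non-null values per column, highest first."""
    # notna() marks filled cells; the column-wise mean of booleans is the filled share.
    return df.notna().mean().mul(100).round(1).sort_values(ascending=False)


# Toy frame with made-up values (hypothetical column names), for illustration only.
toy = pd.DataFrame(
    {
        "fullName": ["A", "B", "C", "D"],
        "residenceAddress": ["1 Main St", None, "2 Oak Ave", None],
        "phone": [None, None, None, "Main 100"],
    }
)
rates = fill_rates(toy)
print(rates)
```

Calling `fill_rates(kkk_df)` on the loaded ledger index should give the same kind of table for the real columns, which makes it easier to compare the name field against the address and telephone fields.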
Why?¶
The Klan's original purpose in collecting this data was probably to manage its internal organization, collect dues, and keep track of its social networks, with the aim of strengthening its influence through control of its membership.
In the 21st century, History Colorado has given the ledgers a different, public purpose: education, historical transparency, and social reflection. By making these materials on the history of racism public, it helps people better understand the social impact of extremism and provides concrete primary-source material for education.
The original ledgers can be seen at the History Colorado Center. Online, the data are available in two forms. The first is a set of PDF image files, which preserve the original appearance of the ledgers and suit historical comparison and direct reading. The second is a CSV file, which suits structured analysis.
Data fields include full names, addresses, phone numbers, business addresses, member numbers, ledger page numbers, and supplementary fields such as "symbolExists" and "Notes & Remarks". I use the CSV file provided by History Colorado and read and analyze it with Python in a Jupyter Notebook.
The dataset has some gaps in record completeness. As noted on the website, the first 69 records are missing, and many address and telephone values are empty. Also, columns such as "Notes & Remarks" use many abbreviations and code names, and readers may need more specialized historical context to understand them.
The data cover only 1924-1926 and are limited geographically to the Greater Denver area, so they may not give a complete picture of the Klan nationwide.
# Sample 10 people who have a note; random_state makes the sample repeatable
notes = kkk_df[['fullName', 'Notes & Remarks']].dropna().sample(n=10, random_state=42)
notes
| | fullName | Notes & Remarks |
|---|---|---|
| 440 | Harry R Miller | Name struck; "DECEASED" |
| 10059 | Harry Ellsworth Meloeny | Name struck; "DECEASED" |
| 21001 | Leslie E Keithline | Arvada - Paid Kerk Aug. 8, 1924 #36 |
| 26731 | George H Goulden | Englewood to Brock 10/18 |
| 4439 | Roy E Merritt | Name struck; "RESIGNED 6-15-26" |
| 28311 | George A Zuber | Rejected ref. 114506 |
| 22593 | William C Callahan | Rejected; Ref 9/18 142127 |
| 17049 | John R Buckwalter | Says he can't go there; Returned his own check |
| 21491 | Terry J Miller | Littleton - Paid Kerk Aug. 8, 1924 #36 |
| 26372 | Ames A Martin | Rejected ref. 113529 |
As you can see, in the 10 randomly selected rows, the content of Notes & Remarks is very hard to interpret without additional context.
One interesting feature of this ledger is the symbolExists column, which records whether a symbol appears next to each person's entry. I think it could be used to distinguish actual KKK members from people who were merely associated with the organization.
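Although the individual notes are hard to read, a few phrases recur in the sample above ("Name struck", "Rejected", "Paid"). A rough keyword grouping can show how common each kind of note is. This is only a sketch: `NOTE_PATTERNS` and `categorize_note` are hypothetical names, the labels are my own reading of the phrases rather than History Colorado's coding, and the demo uses just the ten notes shown in the table.

```python
import pandas as pd

# Keyword -> label pairs. The keywords come from the ten sampled notes;
# the labels are assumptions about what each phrase means.
NOTE_PATTERNS = {
    "Name struck": "struck from ledger",
    "Rejected": "application rejected",
    "Paid": "payment recorded",
}


def categorize_note(note: str) -> str:
    """Return the label of the first keyword found in the note, else 'other'."""
    lowered = note.lower()
    for keyword, label in NOTE_PATTERNS.items():
        if keyword.lower() in lowered:
            return label
    return "other"


# The ten notes from the sample table above.
sample_notes = pd.Series(
    [
        'Name struck; "DECEASED"',
        'Name struck; "DECEASED"',
        "Arvada - Paid Kerk Aug. 8, 1924 #36",
        "Englewood to Brock 10/18",
        'Name struck; "RESIGNED 6-15-26"',
        "Rejected ref. 114506",
        "Rejected; Ref 9/18 142127",
        "Says he can't go there; Returned his own check",
        "Littleton - Paid Kerk Aug. 8, 1924 #36",
        "Rejected ref. 113529",
    ]
)
categories = sample_notes.map(categorize_note)
print(categories.value_counts())
```

The same function could be applied to the full column with `kkk_df['Notes & Remarks'].dropna().map(categorize_note)`. Notes that fall under "other" would still need historical context to interpret.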
# Count how many entries fall under each value of symbolExists
symbol_counts = kkk_df['symbolExists'].value_counts().reset_index()
symbol_counts.columns = ['Symbol', 'Count']
fig = px.pie(symbol_counts, names='Symbol', values='Count',
             title='People with Symbol')
fig.show()
According to the chart above, only 25.4% of the people listed have a symbol. This may indicate that the truly fanatical members of the Ku Klux Klan were fewer than the size of the ledger suggests, and that their number has been greatly overestimated.
The Ku Klux Klan was a racist, xenophobic organization with deep historical roots that became a political force in several states during the 1920s. It opposed Black people, Jews, Catholics, immigrants, gay people, and other marginalized groups. When using this data, we must be wary of its political and racist biases and avoid inflicting further harm on the victims of this history. While publishing the data can be educational, it can also raise ethical and privacy concerns for the descendants of the people in the ledger. This data should be handled with respect and caution.
This dataset helps people understand the organization and operation of the Ku Klux Klan in the 1920s, and it shows that the collection and publication of data is never neutral. The data are both a piece of the history of an extremist group and an important source of information for a society that still confronts discrimination, hatred, and questions of historical justice. People should continue to uncover the social structures hidden behind the data, strengthen the memory of historical injustice and the awareness of resistance, and make the data serve the goals of social progress, fairness, and justice.
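A percentage like this depends on what is counted in the denominator: by default, pandas `value_counts` leaves out missing values, so a share can look different when empty entries are included. The sketch below illustrates the difference. `value_shares` is a hypothetical helper, and the toy column uses made-up "Yes"/"No" values, since the actual coding of `symbolExists` is an assumption here.

```python
import pandas as pd


def value_shares(series: pd.Series, include_missing: bool = False) -> pd.Series:
    """Percentage share of each value; optionally count missing entries too."""
    return (
        series.value_counts(normalize=True, dropna=not include_missing)
        .mul(100)
        .round(1)
    )


# Toy column with made-up values; the real coding of symbolExists may differ.
toy = pd.Series(["Yes", "No", "No", None], name="symbolExists")

print(value_shares(toy))                        # shares among recorded values only
print(value_shares(toy, include_missing=True))  # shares among all rows
```

Running `value_shares(kkk_df['symbolExists'], include_missing=True)` on the real data would show whether the share of entries with a symbol changes once rows without a recorded value are counted.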